Dialog-context dependent language modeling combining n-grams and stochastic context-free grammars
Authors
Abstract
In this paper, we present our research on dialog-dependent language modeling. In accordance with a speech (or sentence) production model in a discourse, we split language modeling into two components: dialog-dependent concept modeling and syntactic modeling. The concept model is conditioned on the last question prompted by the dialog system and is structured using n-grams. The syntactic model, which consists of a collection of stochastic context-free grammars, one for each concept, describes the word sequences that may be used to express the concepts. The resulting LM is evaluated by rescoring N-best lists. We report a significant perplexity improvement with a moderate word error rate reduction within the context of the CU Communicator System, a dialog system for making travel plans by accessing information about flights, hotels and car rentals.
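To make the factorization concrete, here is a minimal sketch of scoring one recognizer hypothesis under such a factored LM, assuming a fixed concept segmentation of the hypothesis. The names (`concept_bigram`, `concept_word_logprob`, the toy probabilities, and the prompt label `"where_from"`) are hypothetical stand-ins for the dialog-dependent concept n-gram and the per-concept SCFGs described above, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): score one N-best hypothesis with a
# dialog-dependent concept bigram times per-concept word-sequence models.
import math

# P(concept_i | concept_{i-1}, last_question): toy table; in a real system this
# would be estimated from dialog transcripts separately per system prompt.
concept_bigram = {
    ("where_from", ("<s>", "depart_city")): 0.6,
    ("where_from", ("depart_city", "depart_date")): 0.7,
}

def concept_word_logprob(concept, words):
    """Stand-in for the per-concept SCFG inside probability of `words`."""
    # A real system would run the Inside algorithm on the concept's SCFG;
    # a crude length penalty keeps this sketch runnable.
    return -1.5 * len(words)

def hypothesis_logprob(question, segmentation):
    """segmentation: list of (concept, word_tuple) pairs for one hypothesis."""
    logp, prev = 0.0, "<s>"
    for concept, words in segmentation:
        p_c = concept_bigram.get((question, (prev, concept)), 1e-4)
        logp += math.log(p_c) + concept_word_logprob(concept, words)
        prev = concept
    return logp

seg = [("depart_city", ("from", "denver")), ("depart_date", ("on", "monday"))]
print(hypothesis_logprob("where_from", seg))
```

In an N-best rescoring setup, this score would be combined with the acoustic score for each hypothesis and the list re-ranked.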
Similar references
Combination Of N-Grams And Stochastic Context-Free Grammars For Language Modeling
This paper describes a hybrid proposal to combine n-grams and Stochastic Context-Free Grammars (SCFGs) for language modeling. A classical n-gram model is used to capture the local relations between words, while a stochastic grammatical model is considered to represent the long-term relations between syntactical structures. In order to define this grammatical model, which will be used on large-vo...
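As a rough illustration of such a combination, the sketch below linearly interpolates a word bigram with a grammar-based predictor; `p_bigram`, `p_scfg`, and the interpolation weight are hypothetical placeholders, not the paper's actual model.

```python
# Minimal sketch (assumptions only): interpolating an n-gram with a
# grammar-based word predictor when scoring a sentence.
import math

def p_bigram(word, prev):
    """Toy bigram table standing in for a trained n-gram model."""
    toy = {("flights", "the"): 0.2, ("to", "flights"): 0.3}
    return toy.get((word, prev), 0.01)

def p_scfg(word, prefix):
    """Stand-in for P(word | prefix) from SCFG prefix probabilities."""
    # A real system would compute this from the grammar; a flat placeholder
    # keeps the sketch runnable.
    return 0.05

def sentence_logprob(words, lam=0.7):
    logp, prev = 0.0, "<s>"
    for i, w in enumerate(words):
        p = lam * p_bigram(w, prev) + (1.0 - lam) * p_scfg(w, words[:i])
        logp += math.log(p)
        prev = w
    return logp

print(sentence_logprob(["the", "flights", "to", "boston"]))
```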
An Empirical Evaluation of Probabilistic Lexicalized Tree Insertion Grammars
We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Grammars (PLTIG), a lexicalized counterpart to Probabilistic Context-Free Grammars (PCFG), to problems in stochastic natural language processing. Comparing the performance of PLTIGs with non-hierarchical N-gram models and PCFGs, we show that PLTIG combines the best aspects of both, with language modeli...
Precise N-Gram Probabilities from Stochastic Context-Free Grammars
We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic structure, among others). The method operates via the computation of substring expectations, which in turn is accomplished by solving systems of linear equations der...
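One ingredient of this kind of computation can be sketched as follows: expected nonterminal counts of a SCFG obtained by solving a small linear system, the sort of quantity the substring-expectation machinery builds on. The toy grammar and variable names are assumptions for illustration, not the paper's algorithm.

```python
# Minimal sketch: expected nonterminal counts in a SCFG via a linear system.
import numpy as np

nonterms = ["S", "NP", "VP"]
# rules: (lhs, rhs_symbols, probability); lowercase strings are terminals
rules = [
    ("S",  ["NP", "VP"], 1.0),
    ("NP", ["john"],     0.7),
    ("NP", ["mary"],     0.3),
    ("VP", ["sleeps"],   1.0),
]

idx = {x: i for i, x in enumerate(nonterms)}
A = np.zeros((len(nonterms), len(nonterms)))  # A[X,Y]: expected Y's per rewrite of X
for lhs, rhs, p in rules:
    for sym in rhs:
        if sym in idx:
            A[idx[lhs], idx[sym]] += p

# c[X] = [X == S] + sum_Y A[Y,X] * c[Y]   =>   (I - A^T) c = e_S
e_S = np.zeros(len(nonterms))
e_S[idx["S"]] = 1.0
c = np.linalg.solve(np.eye(len(nonterms)) - A.T, e_S)
print(dict(zip(nonterms, c)))  # each nonterminal expected exactly once here
```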
Stochastic Lexicalized Tree-Adjoining Grammars
The notion of stochastic lexicalized tree-adjoining grammar (SLTAG) is formally defined. The parameters of a SLTAG correspond to the probability of combining two structures, each one associated with a word. The characteristics of SLTAG are unique and novel since it is lexically sensitive (as N-gram models or Hidden Markov Models) and yet hierarchical (as stochastic context-free grammars). Then...
Publication year: 2001